[GLUTEN-6887][VL] Daily Update Velox Version (2026_04_01)#11860
zhouyuan merged 15 commits into apache:main
Run Gluten Clickhouse CI on x86
f02193f to 82d1a35
"/test-data/parquet-thrift-compat.snappy.parquet"

testGluten("Read Parquet file generated by parquet-thrift") {
  // TODO: https://github.com/apache/gluten/issues/11865
@baibaichen this seems to be due to a missing fix from an old OAP patch: https://github.com/IBM/velox/pull/35/changes
protected val tablesPath: String = UTSystemParameters.tpcdsDecimalDataPath + "/"
protected val db_name: String = "tpcdsdb"
// TODO: fix to use the new DS queries https://github.com/apache/gluten/issues/11871
protected val tpcdsQueries: String =
07d2ba4 to ef679c9
ef679c9 to 3ab0761
Upstream Velox's New Commits:

24e6ab97b by Chengcheng Jin, fix(cudf): Fix complex data type name in format conversion and add tests(Part1) (#16818)
d92b90029 by Natasha Sehgal, refactor: Propagate CastRule cost through canCoerce (#16821)
361a42252 by Rui Mo, fix(fuzzer): Reduce Spark aggregate fuzzer test pressure (#16964)
2c2fe2ab7 by root, fix: Ignore string column statistics for parquet-mr versions before 1.8.2 (#16744)
7faf27a86 by Chengcheng Jin, feat(cudf): Add the log to show detailed fallback messgae (#16900)
e603315e5 by Chang chen, feat(parquet): Add type widening support for INT and Decimal types with configurable narrowing (#16611)
1e1674dd8 by Rajeev Singh, docs: Add blog post for Adaptive per-function CPU tracking (#16945)
0c6b89d61 by Masha Basmanova, fix(build): Guard fuzzer examples subdirectory with VELOX_BUILD_TESTING (#16992)
8d6355d8d by Pratik Pugalia, build: Improve build impact comment layout (#16971)
44d561990 by Masha Basmanova, refactor: Add ConnectorRegistry class with tryGet and unregisterAll (#16977)
793f13f16 by Rajeev Singh, feat(expr-eval):Adaptive per-function CPU sampling for Velox expression evaluation (#16646)
1a4dc7a5a by Pratik Pugalia, fix: Off-by-one boundary bug in make_timestamp validation (#16944)
7f2c75c26 by Pratik Pugalia, Fix incorrect substr length in Tokenizer::matchUnquotedSubscript (#16972)
22b90045e by Masha Basmanova, docs: Add truncate markers to blog posts for cleaner listing page (#16975)

Signed-off-by: glutenperfbot <glutenperfbot@glutenproject-internal.com>
…olumns

When Gluten creates HiveTableHandle, it was passing all columns (including partition columns) as dataColumns. This caused Velox's convertType() to validate partition column types against the Parquet file's physical types, failing when they differ (e.g., LongType in file vs IntegerType from partition inference).

Fix: build dataColumns excluding partition columns (ColumnType::kPartitionKey). Partition column values come from the partition path, not from the file.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
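The filtering step described in this commit message can be sketched as follows. This is a minimal, self-contained illustration and not Gluten's actual code: `ColumnHandle`, its fields, and `buildDataColumns` are hypothetical stand-ins, and only the `kPartitionKey` name is taken from the commit message.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not the
// Velox or Gluten types.
enum class ColumnType { kRegular, kPartitionKey };

struct ColumnHandle {
  std::string name;
  std::string typeName; // type as declared by the table schema
  ColumnType columnType;
};

// Keep only the columns that are physically stored in the data file.
// Partition columns are skipped because their values come from the
// partition path, so their declared types should not be validated
// against the file's physical types.
std::vector<ColumnHandle> buildDataColumns(
    const std::vector<ColumnHandle>& allColumns) {
  std::vector<ColumnHandle> dataColumns;
  for (const auto& column : allColumns) {
    if (column.columnType != ColumnType::kPartitionKey) {
      dataColumns.push_back(column);
    }
  }
  return dataColumns;
}
```

With a schema of two regular columns and one partition column, `buildDataColumns` would return only the two regular columns, in their original order.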
With OAP INT narrowing commit replaced by upstream Velox PR #15173:
- Remove 2 excludes now passing: LongType->IntegerType, LongType->DateType
- Add 2 excludes for new failures: IntegerType->ShortType (OAP removed)

Exclude 63 (net unchanged: -2 +2).
Test results: 21 pass / 63 ignored.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This suite tests the READ path only. Disable native writer so Spark's writer produces correct V2 encodings (DELTA_BINARY_PACKED/DELTA_BYTE_ARRAY).
- Remove 10 excludes for decimal widening tests now passing

Remaining 38 excludes:
- 34: Velox native reader rejects incompatible decimal conversions regardless of reader config (no parquet-mr fallback)
- 4: Velox does not support DELTA_BYTE_ARRAY encoding

Test results: 46 pass / 38 ignored.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
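To make "widening" versus "incompatible" decimal conversions concrete, here is a minimal sketch of a value-preserving check. It is an assumption-laden illustration: the `DecimalType` struct and both function names are hypothetical, and the rule shown is the plain arithmetic condition for a lossless conversion, not necessarily the rule that Velox or Spark applies.

```cpp
// Hypothetical description of a decimal type for this sketch.
struct DecimalType {
  int precision; // total number of decimal digits
  int scale;     // digits after the decimal point
};

// True when every value of `from` is exactly representable in `to`:
// the target must keep at least as many fractional digits (scale) and
// at least as many integer digits (precision - scale).
bool isLosslessDecimalWidening(const DecimalType& from, const DecimalType& to) {
  const int fromIntegerDigits = from.precision - from.scale;
  const int toIntegerDigits = to.precision - to.scale;
  return to.scale >= from.scale && toIntegerDigits >= fromIntegerDigits;
}

// True when every signed integer of the given bit width fits in `to`.
// The digit counts are those of the largest magnitude for each width
// (for example, int32 reaches 2147483648, which has 10 digits).
bool isLosslessIntToDecimal(int intBits, const DecimalType& to) {
  int requiredDigits = 0;
  switch (intBits) {
    case 8:
      requiredDigits = 3;
      break;
    case 16:
      requiredDigits = 5;
      break;
    case 32:
      requiredDigits = 10;
      break;
    case 64:
      requiredDigits = 19;
      break;
    default:
      return false; // unknown width: treat as incompatible
  }
  return to.precision - to.scale >= requiredDigits;
}
```

Under this rule, Decimal(5, 2) to Decimal(7, 2) is a widening, while Decimal(7, 2) to Decimal(5, 2) and Decimal(5, 2) to Decimal(6, 4) are not, because each loses either integer or total digits.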
Velox native reader always behaves like Spark's vectorized reader, so tests that rely on parquet-mr behavior (vectorized=false) fail. Instead of just excluding these 33 tests, add testGluten overrides with expectError=true to verify Velox correctly rejects incompatible conversions.
- 16 unsupported INT->Decimal conversions
- 6 decimal precision narrowing cases
- 11 decimal precision+scale narrowing/mixed cases

VeloxTestSettings: 38 excludes (parent tests) + 33 testGluten overrides
Test results: 79 pass / 38 ignored (33 excluded parent + 5 truly excluded)
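The test counts in this commit message can be cross-checked with a small tally. This sketch reflects one reading of the numbers, assuming that the 46 pass / 38 ignored result is the baseline and that all 33 testGluten overrides pass; `SuiteTally` and the helper names are hypothetical.

```cpp
// Hypothetical bookkeeping type for this sketch.
struct SuiteTally {
  int passed;
  int ignored;
  int total() const { return passed + ignored; }
};

// Adding overrides that all pass raises the pass count. The parent tests
// they replace stay excluded, so the ignored count does not change.
SuiteTally withPassingOverrides(const SuiteTally& before, int overrides) {
  return SuiteTally{before.passed + overrides, before.ignored};
}

// Excluded parent tests that have no testGluten override.
int excludedWithoutOverride(const SuiteTally& tally, int overrides) {
  return tally.ignored - overrides;
}
```

Starting from 46 passed and 38 ignored, adding 33 passing overrides gives 79 passed and 38 ignored, and 38 - 33 leaves the 5 tests that remain excluded without an override, which matches the reported results. The three override groups also sum as stated: 16 + 6 + 11 = 33.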
Signed-off-by: Yuan <yuanzhou@apache.org>
Signed-off-by: Yuan <yuanzhou@apache.org>
Signed-off-by: Yuan <yuanzhou@apache.org>
Signed-off-by: Yuan <yuanzhou@apache.org>
The test data on the ClickHouse side has not been updated, so revert to using the old query.

Signed-off-by: Yuan <yuanzhou@apache.org>
3ab0761 to e4499a6
- test("Eliminate two aggregate joins with attribute reordered") {
+ ignore("Eliminate two aggregate joins with attribute reordered") {
    val sql = """
@zzcclp this test failed. It is not related to this patch; it seems to be caused by changes made in the past two weeks.
I will take a look next week.
Upstream Velox's New Commits:
velox_branch: https://github.com/IBM/velox/commits/dft-2026_04_01
Related issue: #6887